Add `CategoricalMADE` #1269

jnsbck · 2024-09-05T10:58:00Z

What does this implement/fix? Explain your changes

This implements a `CategoricalMADE` to generalize MNLE to multiple discrete dimensions, addressing #1112.
It essentially adapts nflows's `MixtureofGaussiansMADE` to autoregressively model categorical distributions.
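
For orientation, the autoregressive factorization referred to here is presumably the standard chain-rule form. The notation is assumed for this sketch and not taken from the PR: $x_i$ are the discrete variables, $\theta$ the conditioning parameters, $D$ the number of discrete variables, and $K_i$ the number of categories of variable $i$.

```math
p(x_1, \dots, x_D \mid \theta) = \prod_{i=1}^{D} p(x_i \mid x_{<i}, \theta), \qquad x_i \in \{0, \dots, K_i - 1\}
```

Each factor is a categorical distribution, so the log-likelihood is a sum of per-variable categorical log-probabilities.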

Does this close any currently open issues?

Fixes #1112

Comments

I have already discussed this with @michaeldeistler.

Checklist

Put an x in the boxes that apply. You can also fill these out after creating
the PR. If you're unsure about any of them, don't hesitate to ask. We're here to
help! This is simply a reminder of what we are going to look for before merging
your code.

I have read and understood the contribution
guidelines
I agree with re-licensing my contribution from AGPLv3 to Apache-2.0.
I have commented my code, particularly in hard-to-understand areas
I have added tests that prove my fix is effective or that my feature works
I have reported how long the new tests run and potentially marked them
with pytest.mark.slow.
New and existing unit tests pass locally with my changes
I performed linting and formatting as described in the contribution
guidelines
I rebased on main (or there are no conflicts with main)
For reviewer: The continuous deployment (CD) workflows are passing.

codecov · 2024-09-05T11:45:03Z

Codecov Report

Attention: Patch coverage is 92.00000% with 6 lines in your changes missing coverage. Please review.

Project coverage is 78.50%. Comparing base (d3f22b5) to head (ce7c656).
Report is 15 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| sbi/neural_nets/estimators/categorical_net.py | 89.36% | 5 Missing ⚠️ |
| sbi/neural_nets/net_builders/mnle.py | 85.71% | 1 Missing ⚠️ |

Additional details and impacted files

@@             Coverage Diff             @@
##             main    #1269       +/-   ##
===========================================
- Coverage   89.40%   78.50%   -10.91%     
===========================================
  Files         118      119        +1     
  Lines        8715     8811       +96     
===========================================
- Hits         7792     6917      -875     
- Misses        923     1894      +971

| Flag | Coverage Δ | |
|---|---|---|
| unittests | `78.50% <92.00%> (-10.91%)` | ⬇️ |

Flags with carried forward coverage won't be shown.

| Files with missing lines | Coverage Δ | |
|---|---|---|
| sbi/neural_nets/estimators/__init__.py | `100.00% <100.00%> (ø)` | |
| .../neural_nets/estimators/mixed_density_estimator.py | `94.73% <100.00%> (-3.38%)` | ⬇️ |
| sbi/neural_nets/net_builders/categorial.py | `95.83% <100.00%> (+1.09%)` | ⬆️ |
| sbi/neural_nets/net_builders/mnle.py | `96.55% <85.71%> (-3.45%)` | ⬇️ |
| sbi/neural_nets/estimators/categorical_net.py | `92.42% <89.36%> (-5.41%)` | ⬇️ |

... and 49 files with indirect coverage changes

jnsbck · 2024-09-16T07:32:27Z

Hey @janfb,
would very much appreciate your input at this stage:

Currently the PR adds the `CategoricalMADE` and the builder `build_autoregressive_categoricalestimator`, plus some minor modifications to `build_mnle` and `MixedDensityEstimator`. This enables multiple discrete variables with different numbers of classes via `trainer = MNLE(density_estimator=lambda x, y: build_mnle(y, x, categorical_model="made"))`. Note that for some reason `x` and `y` have to be flipped for MNLE.

As far as I can tell, all functionalities of `CategoricalMADE` work for both 1D and ND inputs, and running `Example_01_DecisionMakingModel.ipynb` with the CatMADE matches the ground truth.

The question now is: How should I verify this works? / Which tests should I add/modify? Do you have an idea for a good toy example with several discrete variables that I could use?

I have cooked up a toy simulator for which I am getting good posteriors using SNPE, but for some reason MNLE raises `RuntimeError: probability tensor contains either 'inf', 'nan' or element < 0`, even for the unmodified MNLE. Any ideas why this could be?

This is the simulator:

import torch


def toy_simulator(theta: torch.Tensor, centers: list[torch.Tensor]) -> torch.Tensor:
    batch_size, n_dimensions = theta.shape
    assert len(centers) == n_dimensions, "Number of center sets must match theta dimensions"

    # Calculate discrete classes by assigning each theta to its closest center
    x_disc = torch.stack([
        torch.argmin(torch.abs(centers[i].unsqueeze(1) - theta[:, i].unsqueeze(0)), dim=0)
        for i in range(n_dimensions)
    ], dim=1)

    # Look up the center value assigned to each sample
    closest_centers = torch.stack([centers[i][x_disc[:, i]] for i in range(n_dimensions)], dim=1)
    # Add Gaussian noise to assigned class centers
    std = 0.4
    x_cont = closest_centers + std * torch.randn_like(closest_centers)

    # Columns: continuous observations first, then discrete class indices
    return torch.cat([x_cont, x_disc], dim=1)

The setup:

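from sbi.analysis import pairplot
from sbi.inference import MNLE, SNPE
from sbi.utils import BoxUniform

torch.random.manual_seed(0)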
centers = [
    torch.tensor([-0.5, 0.5]),
    # torch.tensor([-1.0, 0.0, 1.0]),
]

prior = BoxUniform(low=torch.tensor([-2.0]*len(centers)), high=torch.tensor([2.0]*len(centers)))
theta = prior.sample((20000,))
x = toy_simulator(theta, centers)

theta_o = prior.sample((1,))
x_o = toy_simulator(theta_o, centers)

NPE:

trainer = SNPE()
estimator = trainer.append_simulations(theta=theta, x=x).train(training_batch_size=1000)

snpe_posterior = trainer.build_posterior(prior=prior)
posterior_samples = snpe_posterior.sample((2000,), x=x_o)
pairplot(posterior_samples, limits=[[-2, 2], [-2, 2]], figsize=(5, 5), points=theta_o)

and the equivalent MNLE:

trainer = MNLE()
estimator = trainer.append_simulations(theta=theta, x=x).train(training_batch_size=1000)

mnle_posterior = trainer.build_posterior(prior=prior)
mnle_samples = mnle_posterior.sample((10000,), x=x_o)
pairplot(mnle_samples, limits=[[-2, 2], [-2, 2]], figsize=(5, 5), points=theta_o)

Hoping this makes sense. Lemme know if you need clarifications anywhere. Thanks for your feedback.

jnsbck · 2024-10-22T11:50:06Z

Hey @janfb,
you might have missed this, but I would be happy about feedback :)

janfb

thanks a lot for tackling this @jnsbck! 👏

Please find below some comments and questions.
There might be some misunderstanding about variables and categories on my side. We can have a call if that's more efficient than commenting here.

janfb · 2024-10-30T09:43:41Z

sbi/neural_nets/net_builders/mnle.py

+    elif categorical_model == "mlp":
+        assert num_disc == 1, "MLP only supports 1D input."
+        discrete_net = build_categoricalmassestimator(
+            disc_x,
+            batch_y,
+            z_score_x="none",  # discrete data should not be z-scored.
+            z_score_y="none",  # y-embedding net already z-scores.
+            num_hidden=hidden_features,
+            num_layers=hidden_layers,
+            embedding_net=embedding_net,
+        )


more generally, isn't the MLP a special case of the MADE? can't we absorb them into one class?

Comment: Check if testcase is identical, if yes -> rm MLP

This seems to be the case for the examples in the tutorial notebook

c2st between true and MADE MNLE posterior: 0.538
c2st between true and MLP MNLE posterior: 0.5730000000000001
c2st between MADE MNLE and MLP MNLE posterior: 0.5734999999999999

that's great! 👍

janfb · 2024-10-30T09:45:45Z

sbi/made_mnle.ipynb

this should eventually be integrated with the MNLE tutorial in 12_iid_data_and_permutation_invariant_embeddings.ipynb

change "mlp" to "made" and comment that several variables with different num_categories are supported.

jnsbck · 2024-11-04T10:17:28Z

Cool, thanks for all the feedback! A quick call would be great, also to discuss suitable tests for this. Will reach out via email and tackle the straightforward things in the meantime.

jnsbck · 2024-11-14T11:43:30Z

sbi/neural_nets/estimators/categorical_net.py

+
+    # outputs (batch_size, num_variables, num_categories)
+    def log_prob(self, inputs, context=None):
+        outputs = self.forward(inputs, context=context)


are these shapes correct?

jnsbck · 2024-11-14T15:00:34Z

After discussion with @janfb I will:

adapt the simulator of Example_01_DecisionMakingModel.ipynb to multiple discrete variables.
Get this to run for 1D and ND
fix remaining comments/issues
possibly refactor (fold 1D into ND code).

@janfb could you still check tho what is up with the simulator above? Do you have a hunch why the SNPE and MNLE posteriors are different?

EDIT:

wip
✅
✅
✅
add new tests / update old ones with multi-dim example.

jnsbck · 2025-01-09T16:54:22Z

I did a bit more work on this PR, current tests should be passing and I have swapped out all the legacy CategoricalNet code for the CategoricalMADE. See changes and comments above (please close if no longer relevant).
A few things remain:

Add a bit more docs and comments
add test cases
adapt the tutorial to 2d? (Can just add another beta distribution to the prior)
make sure it runs for ND.

This last thing has been haunting me in my sleep, as I cannot figure out what is wrong. Maybe you have an idea of what could be causing this.
For 1D it works, but for ND it always gets the first discrete dim wrong, i.e. yields the prior (all other dims are correct, see image). I am not sure if the conditioning for the first dimension is broken somehow, but I am not able to pin down where this would be happening in my code.

Help would be much appreciated. @janfb

janfb

thanks for the update!
Made another round of comments. Happy to have another call to sort them out.

janfb · 2025-01-13T17:52:32Z

sbi/neural_nets/estimators/categorical_net.py

+        outputs = outputs.reshape(*inputs.shape, self.num_categories)
+        ps = self.compute_probs(outputs)
+
+        # categorical log prob
+        log_prob = torch.log(ps.gather(-1, inputs.unsqueeze(-1).long()))
+        log_prob = log_prob.squeeze(-1).sum(dim=-1)


very naive question here: the outputs are coming from the MADE, i.e., the conditional dependencies are already taken care of internally right?

I am just wondering because for the 1-D case, we used the network-predicted ps to construct a Categorical distribution and then evaluated the inputs under that distribution. This is not needed here because the underlying MADE takes both the inputs and the context and outputs unnormalized conditional probabilities already?

The gather is essentially doing the job of the Categorical here. The two lines below should be equivalent. I think using the Categorical might actually be a bit easier to understand here!

log_prob = Categorical(logits=outputs).log_prob(inputs)
# equivalent
log_prob = F.log_softmax(outputs, dim=-1).gather(-1, inputs.unsqueeze(-1).long()).squeeze(-1)
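
The claimed equivalence can be checked numerically. This is a minimal sketch, not the PR's code: `outputs` is a random stand-in for the reshaped MADE output of shape (batch_size, num_variables, num_categories), it assumes all variables share the same number of categories, and the sizes are arbitrary.

```python
import torch
import torch.nn.functional as F
from torch.distributions import Categorical

torch.manual_seed(0)
batch_size, num_variables, num_categories = 5, 3, 4

# Stand-in for the reshaped network output (unnormalized logits).
outputs = torch.randn(batch_size, num_variables, num_categories)
# Discrete inputs stored as floats, cast to long before indexing.
inputs = torch.randint(num_categories, (batch_size, num_variables)).float()

# Variant 1: Categorical normalizes the logits and looks up the class.
log_prob_categorical = Categorical(logits=outputs).log_prob(inputs.long())

# Variant 2: normalize by hand, gather the class entry, drop the index dim.
log_prob_gather = (
    F.log_softmax(outputs, dim=-1)
    .gather(-1, inputs.unsqueeze(-1).long())
    .squeeze(-1)
)

# Summing over variables gives one joint log-prob per batch element.
joint_log_prob = log_prob_categorical.sum(dim=-1)
```

Both variants return per-variable log-probabilities of shape (batch_size, num_variables). Working with logits and `log_softmax` also avoids taking `torch.log` of probabilities that may underflow to zero.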

janfb · 2025-01-13T18:08:38Z

sbi/neural_nets/estimators/categorical_net.py

+    def _initialize(self):
+        pass


Unless I am missing something the _initialize() is needed only in MixtureOfGaussiansMADE(MADE):, not in MADE, so it's not needed here?

janfb · 2025-01-13T18:10:52Z

sbi/neural_nets/estimators/categorical_net.py

+            for i in range(self.num_variables):
+                outputs = self.forward(samples, context)
+                outputs = outputs.reshape(*samples.shape, self.num_categories)
+                ps = self.compute_probs(outputs)
+                samples[:, :, i] = Categorical(probs=ps[:, :, i]).sample()


same question as above: these samples are internally autoregressive, right? So each discrete variable is sampled given the upstream discrete variables?

I am just confused because I would have expected that, for each iteration, we need to pass the discrete samples drawn so far as context; but this seems to be happening implicitly in the MADE?

Now I see it: in line 148 you are updating samples with the new samples of the current i. It probably boils down to the same thing, but you could also update all samples drawn so far, i.e.,

samples[:, :, :(i+1)] = Categorical(probs=ps[:, :, :(i+1)]).sample()

?

I don't think it will make a difference whether the previous dims are updated, since the dims are autoregressive, but I can make that change.
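
Whether updating the earlier dims matters can be checked on a toy model. The sketch below is not the PR's code: `conditional_probs` is a hypothetical stand-in for the MADE forward pass, with two binary variables where x1 copies x0 with probability 0.99, and samples are 2D rather than the 3D layout used in the hunk. Under these assumptions, resampling all dims up to i at each step appears to weaken the dependence, because x1 was conditioned on a draw of x0 that is then overwritten.

```python
import torch
import torch.nn.functional as F
from torch.distributions import Categorical


def conditional_probs(samples: torch.Tensor) -> torch.Tensor:
    """Hypothetical stand-in for the MADE output, shape (batch, 2, 2).

    p(x0) is uniform and x1 copies x0 with probability 0.99.
    """
    p_x0 = torch.full((samples.shape[0], 2), 0.5)
    p_x1 = 0.01 + 0.98 * F.one_hot(samples[:, 0].long(), num_classes=2).float()
    return torch.stack([p_x0, p_x1], dim=1)


def sample_autoregressive(num_samples: int, resample_previous: bool) -> torch.Tensor:
    samples = torch.zeros(num_samples, 2)
    for i in range(2):
        ps = conditional_probs(samples)
        if resample_previous:
            # Redraw every dim up to and including i.
            samples[:, : i + 1] = Categorical(probs=ps[:, : i + 1]).sample().float()
        else:
            # Draw only dim i, keeping the earlier draws fixed.
            samples[:, i] = Categorical(probs=ps[:, i]).sample().float()
    return samples


torch.manual_seed(0)
per_dim = sample_autoregressive(20_000, resample_previous=False)
resampled = sample_autoregressive(20_000, resample_previous=True)

# Fraction of samples in which x1 equals x0.
agree_per_dim = (per_dim[:, 0] == per_dim[:, 1]).float().mean().item()
agree_resampled = (resampled[:, 0] == resampled[:, 1]).float().mean().item()
```

In this toy setting the per-dim update keeps x1 and x0 in agreement about 99% of the time, while the resample-all variant drops to roughly 50%. So for this stand-in model, updating only the current dim is the variant that preserves the joint.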

janfb

had another look and made two suggestions that could explain the missing first-dim fit.

janfb · 2025-01-14T18:27:10Z

sbi/neural_nets/estimators/categorical_net.py

+            for i in range(self.num_variables):
+                outputs = self.forward(samples, context)
+                outputs = outputs.reshape(*samples.shape, self.num_categories)
+                ps = self.compute_probs(outputs)
+                samples[:, :, i] = Categorical(probs=ps[:, :, i]).sample()


Now I see it: in line 148 you are updating samples with the new samples of the current i. It probably boils down to the same thing, but you could also update all samples drawn so far, i.e.,

samples[:, :, :(i+1)] = Categorical(probs=ps[:, :, :(i+1)]).sample()

?

jnsbck · 2025-02-05T17:55:51Z

Thanks for all the input <3, looking into the remaining ones over the coming days hopefully

jnsbck · 2025-02-06T18:02:46Z

Turns out, since the posterior is an MCMCPosterior, only log_prob is used for sampling and not sample. This bug still eludes me.

I have spent some time today and been able to rule a lot of things out (i.e. posterior, sampling, MNLE related things...), which is great, but nonetheless I am still stuck. I have been able to reduce it to the following example of just training a CategoricalMADE, which means that whatever is causing the weird behaviour can be found here. I am sure I am completely missing something probably obvious. I hope the code below makes it easier to identify.
@dgedon, @janfb.

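#... snle tutorial (incl in this PR), which defines `mixed_simulator`

import torch
from torch.distributions import Beta, Gamma

from sbi.neural_nets.estimators.categorical_net import CategoricalMADE
from sbi.utils import MultipleIndependent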

# Define independent prior.
prior = MultipleIndependent(
    [
        Gamma(torch.tensor([1.0]), torch.tensor([0.5])),
        Beta(torch.tensor([2.0]), torch.tensor([2.0])),
        Beta(torch.tensor([2.0]), torch.tensor([2.0])),
        # Beta(torch.tensor([2.0]), torch.tensor([2.0])),
    ],
    validate_args=False,
)

torch.manual_seed(42)
theta_o = prior.sample((1,))

# Training data
num_simulations = 10000
batch_size = 1000
num_epochs = 100
theta = prior.sample((num_simulations,))
x = mixed_simulator(theta)

# only predict the discrete dimensions (drop the continuous first column)
x = x[:, 1:]

made = CategoricalMADE(
    num_categories=torch.ones(x.shape[1], dtype=torch.int32)*2,
    hidden_features=20,
    context_features=theta.shape[1],
)

# quick and dirty training loop
in_batches = lambda x: x.reshape(num_simulations // batch_size, batch_size, -1)
optimizer = torch.optim.Adam(made.parameters(), lr=5e-4)
for i in range(num_epochs):
    print(f"\repoch {i+1} / {num_epochs}", end="")
    for theta_batch, x_batch in zip(in_batches(theta), in_batches(x)):
        optimizer.zero_grad()
        loss = -made.log_prob(x_batch, theta_batch).mean()
        loss.backward()
        optimizer.step()

p_true_disc = theta_o[0, 1:] # theta specifies the true probs
num_disc = x.shape[1]

# compute marginal likelihoods p(x)
choices = torch.arange(2**num_disc).unsqueeze(-1).bitwise_and(2**torch.arange(num_disc)).ne(0).unsqueeze(1)
p_est_disc = torch.zeros(num_disc)
for i in range(num_disc):
    ways_of_choosing_i = choices[torch.any(choices[:, :, i], dim=-1)].float()
    log_prob = made.log_prob(ways_of_choosing_i, theta_o)
    p_est_disc[i] = torch.exp(log_prob).sum().detach()

print("\n")
print(f"true: {p_true_disc}")
print(f"est: {p_est_disc}") # <-- dim=0 incorrect for dim_disc > 1
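
The marginalization check itself can be validated on a distribution with known marginals. This is a sketch under assumptions: `binary_configurations`, `marginals_from_log_prob` and `independent_log_prob` are hypothetical helpers, and an independent Bernoulli model stands in for `made.log_prob`. If this recovers the true marginals, a mismatch in dim 0 more likely comes from the trained model than from the enumeration.

```python
import torch
from torch.distributions import Bernoulli


def binary_configurations(num_disc: int) -> torch.Tensor:
    """All 2**num_disc binary vectors, shape (2**num_disc, num_disc)."""
    codes = torch.arange(2**num_disc).unsqueeze(-1)
    bits = 2 ** torch.arange(num_disc)
    # Bit i of each code is the value of variable i.
    return codes.bitwise_and(bits).ne(0).float()


def marginals_from_log_prob(log_prob_fn, num_disc: int) -> torch.Tensor:
    """p(x_i = 1) for each i, by summing the joint over all configurations."""
    configs = binary_configurations(num_disc)
    joint = torch.exp(log_prob_fn(configs))
    # Only configurations with x_i == 1 contribute to entry i.
    return (joint.unsqueeze(-1) * configs).sum(dim=0)


# Stand-in joint with known marginals (independent Bernoullis).
p_true = torch.tensor([0.2, 0.7, 0.5])


def independent_log_prob(x: torch.Tensor) -> torch.Tensor:
    return Bernoulli(probs=p_true).log_prob(x).sum(dim=-1)


p_est = marginals_from_log_prob(independent_log_prob, num_disc=3)
total_mass = torch.exp(independent_log_prob(binary_configurations(3))).sum()
```

Checking that the joint sums to one over all configurations (`total_mass`) is a cheap extra sanity test that could also be applied to the trained `CategoricalMADE`.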

janfb reviewed Oct 30, 2024

jnsbck commented Nov 14, 2024

jnsbck added 14 commits January 9, 2025 16:58

71b6eac wip: first draft on categorical made
abac114 wip: forward, log_prob, sample working
f2db6eb wip: CategoricalMassEstimator can be build and MixedDensityEstimator too. log_prob has shape issues tho
73e85b5 wip: sampling and log_prob works for categorical_made. working on getting mixed_density estimator log_probs and sample to work as well
8bbb764 wip: build_mnle works and trains without log_transform. Add made as arg to categorical_model
913776e fix: categorical_made now trains in 1D MNLE
c9e2b88 fix: change net kwarg
db03a5b fix: verify ND training is working with CatMADE.
b3cefa9 fix: fix embedding net mistake
ba5ae95 fix: address comments
ee7a34e wip: save dev nb
1d23ae5 wip: update toy simulator
70b287f wip: save wip
a0f6f44 rm: rm legacy CategoricalNet

jnsbck force-pushed the jnsbck-categorical_made branch from 971201b to 8407911, January 9, 2025 16:30

2e5898b fix: correct i/o shapes, updated tutorial

jnsbck force-pushed the jnsbck-categorical_made branch from 8407911 to 2e5898b, January 9, 2025 16:39

dgedon mentioned this pull request Jan 10, 2025: MNPE class similar to MNLE #1362 (Open)

janfb reviewed Jan 13, 2025

janfb reviewed Jan 14, 2025

aedff99 doc: fix input arg dostrings
ce7c656 wip: fixes from PR implemented
